NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Evaluating overinclusive thinking: Development and validation of the Categorical Overinclusive Thinking Task (COverTT)

https://doi.org/10.1016/j.tsc.2024.101726

DiStefano, Paul V; Patterson, John D; Beaty, Roger E (June 2025, Thinking Skills and Creativity)

Free, publicly-accessible full text available June 1, 2026
Associative Thinking and Creative Ability in Older Adulthood

https://doi.org/10.1080/10400419.2024.2443720

Cosgrove, Abigail L; Diaz, Michele T; DiStefano, Paul V; Kenett, Yoed N; Beaty, Roger E (January 2025, Creativity Research Journal)

Full Text Available
Automatic Scoring of Metaphor Creativity with Large Language Models

https://doi.org/10.1080/10400419.2024.2326343

DiStefano, Paul V; Patterson, John D; Beaty, Roger E (March 2024, Creativity Research Journal)

Metaphor is crucial in human cognition and creativity, facilitating abstract thinking, analogical reasoning, and idea generation. Typically, human raters manually score the originality of responses to creative thinking tasks – a laborious and error-prone process. Previous research sought to remedy these risks by scoring creativity tasks automatically using semantic distance and large language models (LLMs). Here, we extend research on automatic creativity scoring to metaphor generation – the ability to creatively describe episodes and concepts using nonliteral language. Metaphor is arguably more abstract and naturalistic than prior targets of automated creativity assessment. We collected 4,589 responses from 1,546 participants to various metaphor prompts and corresponding human creativity ratings. We fine-tuned two open-source LLMs (RoBERTa and GPT-2) – effectively “teaching” them to score metaphors like humans – before testing their ability to accurately assess the creativity of new metaphors. Results showed both models reliably predicted new human creativity ratings (RoBERTa r = .72, GPT-2 r = .70), significantly more strongly than semantic distance (r = .42). Importantly, the fine-tuned models generalized accurately to metaphor prompts they had not been trained on (RoBERTa r = .68, GPT-2 r = .63). We provide open access to the fine-tuned models, allowing researchers to assess metaphor creativity in a reproducible and timely manner.
more » « less
Full Text Available
Automated Scoring of Scientific Creativity in German

https://doi.org/10.1002/jocb.658

Goecke, Benjamin; DiStefano, Paul V; Aschauer, Wolfgang; Haim, Kurt; Beaty, Roger; Forthmann, Boris (May 2024, The Journal of Creative Behavior)

ABSTRACT Automated scoring is a current hot topic in creativity research. However, most research has focused on the English language and popular verbal creative thinking tasks, such as the alternate uses task. Therefore, in this study, we present a large language model approach for automated scoring of a scientific creative thinking task that assesses divergent ideation in experimental tasks in the German language. Participants are required to generate alternative explanations for an empirical observation. This work analyzed a total of 13,423 unique responses. To predict human ratings of originality, we used XLM‐RoBERTa (Cross‐lingual Language Model‐RoBERTa), a large, multilingual model. The prediction model was trained on 9,400 responses. Results showed a strong correlation between model predictions and human ratings in a held‐out test set (n = 2,682;r = 0.80; CI‐95% [0.79, 0.81]). These promising findings underscore the potential of large language models for automated scoring of scientific creative thinking in the German language. We encourage researchers to further investigate automated scoring of other domain‐specific creative thinking tasks.
more » « less
Full Text Available

Search for: All records